<HTML>

<HEAD>
<title>Performance notes: sparse_hash, dense_hash, sparsetable</title>
</HEAD>

<BODY>

<H2>Performance Numbers</H2>

<p>Here are some performance numbers from an example desktop machine,
taken from a version of time_hash_map that was instrumented to also
report memory allocation information (this modification is not
included by default because it required a big hack to do, including
modifying the STL code to not try to do its own freelist management).</p>

<p>Note there are lots of caveats on these numbers: they may differ from
machine to machine and compiler to compiler, and they only test a very
particular usage pattern that may not match how you use hashtables --
for instance, they test hashtables with very small keys.  However,
they're still useful for a baseline comparison of the various
hashtable implementations.</p>

<p>These figures are from a 2.80GHz Pentium 4 with 2G of memory.  The
'standard' hash_map and map implementations are the SGI STL code
included with <tt>gcc2</tt>.  Compiled with <tt>gcc2.95.3 -g
-O2</tt></p>

<pre>
======
Average over 10000000 iterations
Wed Dec  8 14:56:38 PST 2004

SPARSE_HASH_MAP:
map_grow                  665 ns
map_predict/grow          303 ns
map_replace               177 ns
map_fetch                 117 ns
map_remove                192 ns
memory used in map_grow    84.3956 Mbytes

DENSE_HASH_MAP:
map_grow                   84 ns
map_predict/grow           22 ns
map_replace                18 ns
map_fetch                  13 ns
map_remove                 23 ns
memory used in map_grow   256.0000 Mbytes

STANDARD HASH_MAP:
map_grow                  162 ns
map_predict/grow          107 ns
map_replace                44 ns
map_fetch                  22 ns
map_remove                124 ns
memory used in map_grow   204.1643 Mbytes

STANDARD MAP:
map_grow                  297 ns
map_predict/grow          282 ns
map_replace               113 ns
map_fetch                 113 ns
map_remove                238 ns
memory used in map_grow   236.8081 Mbytes
</pre>


<H2><A name="hashfn">A Note on Hash Functions</A></H2>

<p>For good performance, the sparsehash hash routines depend on a good
hash function: one that distributes data evenly.  Many hashtable
implementations come with sub-optimal hash functions that can degrade
performance.  For instance, the hash function given in Knuth's _Art of
Computer Programming_, and the default string hash function in SGI's
STL implementation, both distribute certain data sets unevenly,
leading to poor performance.</p>

<p>As an example, in one test of the default SGI STL string hash
function against the Hsieh hash function (see below), for a particular
set of string keys, the Hsieh function resulted in hashtable lookups
that were 20 times as fast as the STLPort hash function.  The string
keys were chosen to be "hard" to hash well, so these results may not
be typical, but they are suggestive.</p>

<p>There has been much research over the years into good hash
functions.  Here are some hash functions of note.</p>

<ul>
  <li> Bob Jenkins: <A HREF="http://burtleburtle.net/bob/hash/">http://burtleburtle.net/bob/hash/</A>
  <li> Paul Hsieh: <A HREF="http://www.azillionmonkeys.com/qed/hash.html">http://www.azillionmonkeys.com/qed/hash.html</A>
  <li> Fowler/Noll/Vo (FNV): <A HREF="http://www.isthe.com/chongo/tech/comp/fnv/">http://www.isthe.com/chongo/tech/comp/fnv/</A>
  <li> MurmurHash: <A HREF="http://murmurhash.googlepages.com/">http://murmurhash.googlepages.com/</A>
</ul>

</body>
</html>
